
    Information Flow in Color Appearance Neural Networks

    Color Appearance Models are biological networks consisting of a cascade of linear+nonlinear layers that modify the linear measurements at the retinal photoreceptors, leading to an internal (nonlinear) representation of color that correlates with psychophysical experience. The basic layers of these networks include: (1) chromatic adaptation (normalization of the mean and covariance of the color manifold), (2) a change to opponent color channels (a PCA-like rotation in color space), and (3) saturating nonlinearities yielding perceptually Euclidean color representations (similar to dimension-wise equalization). The Efficient Coding Hypothesis argues that these transforms should emerge from information-theoretic goals. If this hypothesis holds in color vision, the question is: what is the coding gain due to the different layers of the color appearance networks? In this work, a representative family of Color Appearance Models is analyzed in terms of how the redundancy among the chromatic components is modified along the network and how much information is transferred from the input data to the noisy response. The proposed analysis uses data and methods that were not available before: (1) new colorimetrically calibrated scenes under different CIE illuminations for a proper evaluation of chromatic adaptation, and (2) new statistical tools to estimate (multivariate) information-theoretic quantities between multidimensional sets based on Gaussianization. Results confirm that the Efficient Coding Hypothesis holds for current color vision models, and identify the psychophysical mechanisms critically responsible for the gains in information transfer: the opponent channels and their nonlinear nature are more important than chromatic adaptation at the retina.
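
    As a minimal illustration of the second mechanism above, the following sketch (hypothetical 3-channel covariance, numpy only; the actual analysis uses Gaussianization-based estimators on calibrated scenes) shows that a PCA-like rotation to opponent-style axes removes the second-order redundancy, measured here with the closed-form total correlation of a Gaussian:

```python
import numpy as np

def gaussian_total_correlation(cov):
    """Closed-form total correlation (in nats) of a multivariate Gaussian:
    T = 0.5 * (sum_i log(var_i) - log(det(Sigma)))."""
    return 0.5 * (np.sum(np.log(np.diag(cov))) - np.log(np.linalg.det(cov)))

# hypothetical covariance of correlated 3-channel photoreceptor responses
Sigma = np.array([[1.0, 0.8, 0.6],
                  [0.8, 1.0, 0.7],
                  [0.6, 0.7, 1.0]])
T_before = gaussian_total_correlation(Sigma)

# PCA-like rotation to "opponent" axes diagonalizes the covariance,
# removing all second-order redundancy among the channels
_, V = np.linalg.eigh(Sigma)
Sigma_rot = V.T @ Sigma @ V
T_after = gaussian_total_correlation(Sigma_rot)
```

The residual (higher-order) redundancy on real chromatic data is what the Gaussianization-based tools in the paper are designed to measure.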

    Appropriate kernels for Divisive Normalization explained by Wilson-Cowan equations

    Cascades of standard Linear+NonLinear-Divisive Normalization transforms [Carandini&Heeger12] can be easily fitted using the appropriate formulation introduced in [Martinez17a] to reproduce the perception of image distortion in naturalistic environments. However, consistently with [Rust&Movshon05], training the model in naturalistic environments does not guarantee the prediction of well-known phenomena illustrated by artificial stimuli. For example, the cascade of Divisive Normalizations fitted with image quality databases has to be modified to include a variety of aspects of the masking of simple patterns. Specifically, the standard Gaussian kernels of [Watson&Solomon97] have to be augmented with extra weights [Martinez17b]. These can be introduced ad hoc, guided by the intuition needed to fix the empirical failures of the original model, but a more principled justification for this modification would be desirable. In this work we give a theoretical justification of this empirical modification of the Watson&Solomon kernel based on the Wilson-Cowan model of cortical interactions [WilsonCowan73]. Specifically, we show that the analytical relation proposed here between the Divisive Normalization model and the Wilson-Cowan model leads to the kind of extra factors that have to be included, and explains their qualitative dependence on frequency.
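
    A minimal sketch of the canonical divisive normalization stage discussed above (1-D toy signal, hand-picked Gaussian kernel and constants; not the fitted model of [Martinez17a]) illustrates the masking behavior the kernels must reproduce: the same local input elicits a weaker response when the surround is active:

```python
import numpy as np

def divisive_normalization(e, b, H, gamma=2.0):
    """Canonical divisive normalization: rectified, exponentiated responses
    divided by a kernel-weighted pool of their neighbours' activity."""
    a = np.abs(e) ** gamma
    return np.sign(e) * a / (b + H @ a)

n = 32
idx = np.arange(n)
# toy Gaussian interaction kernel, row-normalised
H = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / 3.0) ** 2)
H /= H.sum(axis=1, keepdims=True)

e_iso = np.zeros(n); e_iso[16] = 1.0    # isolated stimulus
e_mask = np.ones(n)                     # same stimulus in a busy surround
r_iso = divisive_normalization(e_iso, b=0.1, H=H)[16]
r_mask = divisive_normalization(e_mask, b=0.1, H=H)[16]
# masking: the response at position 16 drops when the pool is active
```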

    Functional Connectome of the Human Brain with Total Correlation

    Recent studies proposed the use of Total Correlation to describe functional connectivity among brain regions as a multivariate alternative to conventional pairwise measures such as correlation or mutual information. In this work, we build on this idea to infer a large-scale (whole-brain) connectivity network based on Total Correlation and show the possibility of using this kind of network as a biomarker of brain alterations. In particular, this work uses Correlation Explanation (CorEx) to estimate Total Correlation. First, we show that CorEx estimates of Total Correlation and the resulting clusterings are reliable when compared with ground-truth values. Second, the large-scale connectivity network inferred from large open fMRI datasets is consistent with existing neuroscience studies but, interestingly, can capture relations beyond pairwise interactions between regions. Finally, we show how connectivity graphs based on Total Correlation can also be an effective tool to aid in the discovery of brain diseases.
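
    The advantage of Total Correlation over pairwise measures can be seen in a minimal synthetic example (binary XOR toy data with plug-in entropy estimates, rather than CorEx on fMRI signals): the three variables carry one bit of shared structure that every pairwise mutual information misses:

```python
import numpy as np

def entropy(samples):
    """Empirical Shannon entropy (bits) of the rows of a discrete array."""
    _, counts = np.unique(samples, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(0)
x = rng.integers(0, 2, 10000)
y = rng.integers(0, 2, 10000)
z = x ^ y                      # z is determined jointly, not pairwise
data = np.stack([x, y, z], axis=1)

# total correlation: sum of marginal entropies minus the joint entropy
tc = sum(entropy(data[:, [i]]) for i in range(3)) - entropy(data)
# pairwise mutual information between x and y is ~0
mi_xy = (entropy(data[:, [0]]) + entropy(data[:, [1]])
         - entropy(data[:, [0, 1]]))
```

Here tc is close to 1 bit while mi_xy is close to 0, which is the kind of beyond-pairwise relation the abstract refers to.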

    Towards a Functional Explanation of the Connectivity LGN - V1

    The principles behind the connectivity between the LGN and V1 are not well understood. Models have to explain two basic experimental trends: (i) the combination of thalamic responses is local and gives rise to a variety of oriented Gabor-like receptive fields in V1 [1], and (ii) these filters are spatially organized in orientation maps [2]. Competing explanations of orientation maps use purely geometrical arguments such as optimal wiring or packing from the LGN [3-5], but they make no explicit reference to visual function. On the other hand, explanations based on functional arguments such as maximum information transference (infomax) [6,7] usually neglect a potential contribution from local LGN circuitry. In this work we explore the ability of the conventional functional arguments (infomax and variants) to derive both trends simultaneously, assuming a plausible sampling model linking the retina to the LGN [8], as opposed to previous attempts operating from the retina. Consistently with other aspects of human vision [14-16], additional constraints should be added to plain infomax to understand the second trend of the LGN-V1 connectivity. Possibilities include an energy budget [11], wiring constraints [8], or error minimization in noisy systems, either linear [16] or nonlinear [14,15]. In particular, consideration of high noise (neglected here) would favor redundancy in the prediction (which would be required to match the relations between spatially neighboring neurons in the same orientation domain).
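
    The limitation of plain infomax alluded to above can be sketched in a toy Gaussian setting (hypothetical 2-D covariance and noise level, standing in for LGN responses): for Gaussian signals and noise, any rotation applied after a whitening transform leaves the transmitted information unchanged, so infomax alone cannot select one particular set of oriented filters:

```python
import numpy as np

def gaussian_infomax(W, Sigma, noise_var=0.1):
    """I(x; Wx + n) for Gaussian input (covariance Sigma) and white
    Gaussian output noise: 0.5 * log det(I + W Sigma W^T / noise_var)."""
    C = W @ Sigma @ W.T / noise_var
    return 0.5 * np.log(np.linalg.det(np.eye(len(C)) + C))

Sigma = np.array([[1.0, 0.9],      # toy covariance of thalamic responses
                  [0.9, 1.0]])
w, V = np.linalg.eigh(Sigma)
W = np.diag(w ** -0.5) @ V.T       # a whitening transform

theta = 0.7                        # arbitrary rotation of the filters
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
I0 = gaussian_infomax(W, Sigma)
I1 = gaussian_infomax(R @ W, Sigma)   # any rotated filter bank scores the same
```

This rotational degeneracy is one way to see why extra constraints (wiring, energy, noise robustness) are needed on top of infomax.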

    PerceptNet: A Human Visual System Inspired Neural Network for Estimating Perceptual Distance

    Traditionally, the vision community has devised algorithms to estimate the distance between an original image and images that have been subjected to perturbations. Inspiration was usually taken from the human visual system and how it processes different perturbations, in order to replicate the extent to which they determine our ability to judge image quality. While recent works have presented deep neural networks trained to predict human perceptual quality, very few borrow any intuitions from the human visual system. To address this, we present PerceptNet, a convolutional neural network whose architecture has been chosen to reflect the structure and various stages of the human visual system. We evaluate PerceptNet on various traditional perception datasets and note strong performance on a number of them compared with traditional image quality metrics. We also show that including a nonlinearity inspired by the human visual system in classical deep neural network architectures can increase their ability to judge perceptual similarity. Compared to similar deep learning methods, the performance is similar, although our network has several orders of magnitude fewer parameters.
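
    The HVS-inspired nonlinearity referred to above belongs to the divisive-normalization family; a scalar sketch (generic GDN-style form with toy parameters, not the trained PerceptNet layer) shows its characteristic saturating, contrast-compressing behavior:

```python
import numpy as np

def gdn(x, beta, gamma):
    """GDN-style divisive normalization: each channel's response is
    divided by a pooled function of the squared responses of all channels."""
    # x: (channels, pixels); beta: (channels,); gamma: (channels, channels)
    return x / np.sqrt(beta[:, None] + gamma @ x ** 2)

beta = np.ones(1)                # toy parameters for a single channel
gamma = np.eye(1)
r_low = gdn(np.array([[1.0]]), beta, gamma)[0, 0]    # 1/sqrt(2)  ~ 0.71
r_high = gdn(np.array([[10.0]]), beta, gamma)[0, 0]  # 10/sqrt(101) ~ 0.995
# a tenfold increase in input yields far less than a tenfold response
```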

    What You Hear Is What You See: Audio Quality Metrics From Image Quality Metrics

    In this study, we investigate the feasibility of utilizing state-of-the-art image perceptual metrics for evaluating audio signals by representing them as spectrograms. The encouraging outcome of the proposed approach rests on the similarity between the neural mechanisms in the auditory and visual pathways. Furthermore, we customise one of the metrics, which has a psychoacoustically plausible architecture, to account for the peculiarities of sound signals. We evaluate the effectiveness of our proposed metric and several baseline metrics on a music dataset, with promising results in terms of the correlation between the metrics and the perceived quality of audio as rated by human evaluators.
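
    The pipeline can be sketched as follows (numpy-only STFT, and a plain RMSE as a stand-in where a perceptual image metric would be plugged in; the window, hop, and test signal are illustrative choices, not those of the paper):

```python
import numpy as np

def log_spectrogram(x, win=256, hop=128):
    """Magnitude log-spectrogram via a Hann-windowed STFT (numpy only)."""
    w = np.hanning(win)
    frames = [x[i:i + win] * w for i in range(0, len(x) - win, hop)]
    S = np.abs(np.fft.rfft(np.stack(frames), axis=1))
    return np.log1p(S)

def image_metric(a, b):
    """Stand-in image distance (RMSE); a perceptual metric such as SSIM
    or an HVS-inspired network would be substituted here."""
    return np.sqrt(np.mean((a - b) ** 2))

fs = 8000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t)                       # 1 s of a 440 Hz tone
noisy = clean + 0.3 * np.random.default_rng(0).normal(size=fs)

d_self = image_metric(log_spectrogram(clean), log_spectrogram(clean))
d_noise = image_metric(log_spectrogram(clean), log_spectrogram(noisy))
# the spectrogram-domain distance grows with the audible degradation
```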

    Derivatives and Inverse of a Linear-Nonlinear Multi-Layer Spatial Vision Model

    Analyzing the mathematical properties of perceptually meaningful linear-nonlinear transforms is interesting because this computation is at the core of many vision models. Here we carry out such an analysis in detail using a specific model [Malo & Simoncelli, SPIE Human Vision Electr. Imag. 2015], which is illustrative because it consists of a cascade of standard linear-nonlinear modules. The interest of the analytic results and the numerical methods involved transcends the particular model because of the ubiquity of the linear-nonlinear structure. Here we extend [Malo&Simoncelli 15] by considering 4 layers: (1) linear spectral integration and a nonlinear brightness response, (2) definition of local contrast using linear filters and divisive normalization, (3) a linear CSF filter and nonlinear local contrast masking, and (4) a linear wavelet-like decomposition and nonlinear divisive normalization to account for orientation- and scale-dependent masking. The extra layers were measured using Maximum Differentiation [Malo et al. VSS 2016]. First, we describe the general architecture using a unified notation in which every module is composed of isomorphic linear and nonlinear transforms. The chain rule simplifies the analysis of systems with this modular architecture, and invertibility is related to the non-singularity of the Jacobian matrices. Second, we consider the details of the four layers in our particular model and how they improve the original version of the model. Third, we explicitly list the derivatives of every module, which are relevant for the definition of perceptual distances, perceptual gradient descent, and characterization of the deformation of space. Fourth, we address the inverse, and we find different analytical and numerical problems in each specific module. Solutions are proposed for all of them. Finally, we describe through examples how to use the toolbox to apply and check the above theory.
    In summary, the formulation and toolbox are ready to explore the geometric and perceptual issues addressed in the introductory section (giving all the technical information that was missing in [Malo&Simoncelli 15]).
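
    The role of the chain rule and the Jacobian mentioned above can be sketched with a toy two-stage cascade (random 3x3 linear stages and a tanh nonlinearity standing in for the actual model layers): the analytic Jacobian is a product of per-layer factors, matches a finite-difference check, and its non-singularity certifies local invertibility:

```python
import numpy as np

# toy two-stage linear+nonlinear cascade: y = g(L2 @ g(L1 @ x)),
# with tanh standing in for the model's saturating nonlinearities
g = np.tanh
def dg(u):
    return 1.0 / np.cosh(u) ** 2       # elementwise derivative of tanh

rng = np.random.default_rng(0)
L1 = 0.5 * rng.normal(size=(3, 3))
L2 = 0.5 * rng.normal(size=(3, 3))

def forward(x):
    return g(L2 @ g(L1 @ x))

def jacobian(x):
    """Chain rule over the cascade: J = diag(g'(u2)) @ L2 @ diag(g'(u1)) @ L1."""
    u1 = L1 @ x
    u2 = L2 @ g(u1)
    return np.diag(dg(u2)) @ L2 @ np.diag(dg(u1)) @ L1

x = np.array([0.2, -0.1, 0.4])
J = jacobian(x)

# finite-difference check of the analytic Jacobian
eps = 1e-6
J_num = np.column_stack([(forward(x + eps * e) - forward(x - eps * e)) / (2 * eps)
                         for e in np.eye(3)])

# local invertibility holds wherever the Jacobian is non-singular
invertible = abs(np.linalg.det(J)) > 1e-12
```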